GARUDA: A System for Large-Scale Mining of Statistically Significant Connected Subgraphs
Authors
Abstract
Unraveling “interesting” subgraphs corresponding to disease/crime hotspots or characterizing habitation shift patterns is an important graph mining task. With the availability and growth of large-scale real-world graphs, mining for such subgraphs has become a pressing need for graph miners and non-technical end users alike. In this demo, we present GARUDA, a system capable of mining large-scale graphs for statistically significant subgraphs in a scalable manner, and provide: (1) a detailed description of the various features and user-friendly GUI of GARUDA; (2) a brief description of the system architecture; and (3) a demonstration scenario for the audience. The demonstration showcases one real graph mining task as well as GARUDA's ability to scale to large real graphs, with speedups of up to 8–10 times over the state-of-the-art MSCS algorithm.
Similar resources
Fast Hierarchy Construction for Dense Subgraphs
Discovering dense subgraphs and understanding the relations among them is a fundamental problem in graph mining. We want to not only identify dense subgraphs, but also build a hierarchy among them (e.g., larger but sparser subgraphs formed by two smaller dense subgraphs). Peeling algorithms (k-core, k-truss, and nucleus decomposition) have been effective to locate many dense subgraphs. However,...
Statistically significant subgraphs for genome-wide association study
Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with...
Dynamic Modelling of a Compressed Air Energy Storage System in a Grid Connected Photovoltaic Plant
The use of photovoltaic (PV) cells in domestic and industrial applications has grown rapidly in recent years. Constructing PV plants is a sensible way to produce free electricity at large scale, especially in countries with higher solar irradiation potential. On the other hand, compressed air energy storage (CAES) has already been proposed to be employed for energy storage a...
A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs
Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...
Arabesque: A System for Distributed Graph Mining - Extended version
Distributed data processing platforms such as MapReduce and Pregel have substantially simplified the design and deployment of certain classes of distributed graph analytics algorithms. However, these platforms are not a good match for distributed graph mining problems, such as finding frequent subgraphs in a graph. Given an input graph, these problems require exploring a very la...
Journal title:
- PVLDB
Volume 9, issue
Pages -
Publication date 2016